Journal article

Principled Graph Matching Algorithms for Integrating Multiple Data Sources

D Zhang, BIP Rubinstein, J Gemmell

IEEE Transactions on Knowledge and Data Engineering | Published: 2015

Abstract

This paper explores combinatorial optimization for problems of max-weight graph matching on multi-partite graphs, which arise in integrating multiple data sources. In the most common two-source case, it is often desirable for the final matching to be one-to-one; the database and statistical record linkage communities accomplish this by weighted bipartite graph matching on similarity scores. Such matchings are intuitively appealing: they leverage a natural global property of many real-world entity stores - that of being nearly deduped - and are known to provide significant improvements to precision and recall. Unfortunately, unlike the bipartite case, exact max-weight matching on multi-partit..
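The two-source case mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' method: it finds a max-weight one-to-one matching by exhaustive search over a tiny, made-up similarity matrix, whereas real systems would typically use a polynomial-time assignment algorithm such as the Hungarian method. The function names and the scores are hypothetical.

```python
from itertools import permutations


def max_weight_one_to_one(scores):
    """Return (total, pairs) for the best one-to-one matching.

    scores[i][j] is the similarity between record i of source A and
    record j of source B. Assumes source A has no more records than
    source B. Exhaustive search: only suitable for tiny inputs.
    """
    n_a, n_b = len(scores), len(scores[0])
    best_total, best_pairs = float("-inf"), []
    # Each permutation assigns a distinct column (B record) to every row (A record).
    for cols in permutations(range(n_b), n_a):
        total = sum(scores[i][j] for i, j in enumerate(cols))
        if total > best_total:
            best_total, best_pairs = total, list(enumerate(cols))
    return best_total, best_pairs


def best_match_per_row(scores):
    """Baseline without the one-to-one constraint: each A record picks its top B record."""
    return [(i, max(range(len(row)), key=row.__getitem__))
            for i, row in enumerate(scores)]


# Hypothetical similarity scores between three records in each source.
scores = [
    [0.90, 0.80, 0.10],
    [0.85, 0.20, 0.10],
    [0.10, 0.30, 0.70],
]

total, pairs = max_weight_one_to_one(scores)
unconstrained = best_match_per_row(scores)
print("one-to-one matching:", pairs)
print("per-row best match: ", unconstrained)
```

In this toy input the per-row baseline links two records from the first source to the same record in the second, while the one-to-one matching resolves the conflict by maximizing the total score, which is the behaviour the abstract describes as exploiting nearly deduplicated sources.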


University of Melbourne Researchers